Enabling soft queries for data retrieval

Authors

  • Hwanjo Yu
  • Seung-won Hwang
  • Kevin Chen-Chuan Chang
Abstract

Data retrieval, that is, finding relevant data from large databases, has become a serious problem as myriad databases have been brought online on the Web. For instance, querying the for-sale houses in Chicago on realtor.com returns thousands of matching houses. Similarly, querying “digital camera” on froogle.com returns hundreds of thousands of results. This data retrieval is essentially an online ranking problem, i.e., ranking data results according to the user’s preference effectively and efficiently. This paper proposes a new rank-query framework for effectively incorporating “user-friendly” rank-query formulation into “database (DB)-friendly” rank-query processing, in order to enable “soft” queries on databases. Our framework assumes, as the “back-end,” the score-based ranking model for expressive and efficient query processing. On top of the score-based model, as the “front-end,” we adopt an SVM-ranking mechanism for providing intuitive and exploratory query formulation. In essence, our framework enables users to formulate queries simply by ordering some sample objects, while learning the “DB-friendly” ranking function F from the partial orders. Such learned functions can then be processed and optimized by existing database systems. We demonstrate the efficiency and effectiveness of our framework using real-life user queries and datasets: our results show that the system effectively learns quantitative ranking functions from qualitative feedback from users with efficient online processing. © 2005 Elsevier B.V. All rights reserved.
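The idea of learning a ranking function F from a user's ordering of sample objects can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a linear F, trains it with a simple pairwise hinge-loss update (in the spirit of SVM ranking), and uses made-up house data with two normalized features (price, size). All names, parameters, and values below are hypothetical.

```python
def learn_ranking_weights(objects, preferences, epochs=200, lr=0.1, reg=0.01):
    """Fit a linear scoring function F(x) = w . x from pairwise preferences.

    objects:     dict mapping object id -> tuple of numeric features
    preferences: list of (better_id, worse_id) pairs given by the user
    Hypothetical sketch: plain subgradient descent on a pairwise hinge loss.
    """
    dim = len(next(iter(objects.values())))
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in preferences:
            # Feature difference; we want w . diff >= 1 (a margin of 1).
            diff = [b - c for b, c in zip(objects[better], objects[worse])]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # L2 regularization shrinks the weights slightly on every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if margin < 1.0:
                # Preference violated or too close: move w toward diff.
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def score(w, features):
    """Evaluate the learned ranking function F on one object."""
    return sum(wi * xi for wi, xi in zip(w, features))


# Made-up sample: (normalized price, normalized size) per house.
houses = {
    "a": (0.2, 0.8),
    "b": (0.9, 0.9),
    "c": (0.5, 0.3),
    "d": (0.8, 0.2),
}
# The user orders the samples a > b > c > d, given here as adjacent pairs.
user_order = [("a", "b"), ("b", "c"), ("c", "d")]

weights = learn_ranking_weights(houses, user_order)
ranking = sorted(houses, key=lambda h: score(weights, houses[h]), reverse=True)
print(weights, ranking)
```

Because the learned F is just a weighted sum of attributes, it is the kind of score expression a database could evaluate in a ranked (top-k) query, which is the sense in which the abstract calls the function “DB-friendly.”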

Similar resources

Enabling Data Retrieval: by Ranking and Beyond

The ubiquitous usage of databases for managing structured data, compounded with the expanded reach of the Internet to end users, has brought forward new scenarios of data retrieval. Users often want to express non-traditional fuzzy queries with soft criteria, in contrast to Boolean queries, and to explore what choices are available in databases and how they match the query criteria. Conventiona...

A Multi-Objective Genetic Algorithm for Learning Linguistic Persistent Queries in Text Retrieval Environments

Persistent queries are a specific kind of query used in information retrieval systems to represent a user’s long-term standing information need. These queries can take many different structures, the “bag of words” being the most commonly used. They can sometimes be formulated by the user, although this task is usually difficult for the user, and the persistent query is then automatically derive...

Investigating the Impact of Authors’ Rank in Bibliographic Networks on Expertise Retrieval

Background and Aim: This research investigates the impact of authors’ rank in bibliographic networks on the document-centered model of expertise retrieval. Its purpose is to find out what kind of author ranking in bibliographic networks can improve the performance of the document-centered model. Methodology: The current research is experimental. To operationalize research goals, a new test colle...

Fuzzy Query Processing for Document Retrieval Based on GFNGMA Operators

In recent years, geometric-mean averaging operators (GMA operators) have been proposed to overcome the drawbacks of the existing T-operators and averaging operators for handling the Boolean “AND” and “OR” operations in fuzzy information retrieval. However, the GMA operators cannot deal with queries represented by generalized fuzzy numbers. In this paper, we present generalized fuzzy number geo...

External Plagiarism Detection based on Human Behaviors in Producing Paraphrases of Sentences in English and Persian Languages

With the advent of the internet and easy access to digital libraries, plagiarism has become a major issue. Applying search engines is one of the plagiarism detection techniques; it converts plagiarism patterns to search queries. Generating suitable queries is the heart of this technique, and existing methods suffer from a lack of accurate queries, precision, and speed of retrieved result...

Journal:
  • Inf. Syst.

Volume 32

Publication date: 2007